Fast deterministic detector sampling by rafaelha · Pull Request #78 · QuEraComputing/tsim

rafaelha · 2026-03-25T21:16:33Z

This PR contains the Clifford detector sampling fast path used in https://arxiv.org/pdf/2604.01059

Instead of compiling graphs into parametric expressions, one can read off the Tanner graph directly for deterministic detectors.
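
As a rough illustration of the idea (a sketch under assumptions, not tsim's implementation): if every detector is a fixed parity of independent error bits, the Tanner graph can be stored as a binary matrix and detector samples follow from a matrix product mod 2. The function name `sample_detectors`, the dense matrix layout, and the per-error probabilities below are hypothetical.

```python
import numpy as np


def sample_detectors(tanner, error_probs, shots, rng):
    """Sample detector bits from a dense Tanner matrix (illustrative only).

    tanner[d, e] == 1 means that error e flips detector d.
    error_probs[e] is the assumed independent probability of error e.
    """
    # Draw independent error bits, one row per shot.
    errors = rng.random((shots, tanner.shape[1])) < error_probs
    # Each detector is the XOR (parity) of the errors it is connected to.
    parity = errors.astype(np.int64) @ tanner.T.astype(np.int64)
    return parity % 2 == 1


rng = np.random.default_rng(0)
tanner = np.array([[1, 1, 0], [0, 1, 1]])
detectors = sample_detectors(tanner, np.array([1e-3, 1e-3, 1e-3]), 8, rng)
print(detectors.shape)  # (8, 2)
```

A real implementation would likely exploit sparsity instead of a dense product; the dense form is used here only because it is short.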

This significantly speeds up Clifford-only circuits (non-Clifford circuits are only slightly sped up). For example, for the circuits from the Clifft paper I got the following updated numbers:

Note that r=1 coherent noise circuits from Clifft become Clifford-only circuits after ZX reduction in the Tsim compilation step.

import time

import tsim


def benchmark_detector_sampler(path: str, n: int) -> None:
    """Print detector-sampling throughput for the circuit stored at `path`."""
    c = tsim.Circuit.from_file(path)
    sampler = c.compile_detector_sampler()

    batch_size = n
    # Warm-up call, kept outside the timed region (e.g. one-time compilation).
    sampler.sample(n, batch_size=batch_size)

    # Keep sampling batches of n shots until at least 5 seconds have elapsed.
    start = time.perf_counter()
    samples = 0
    duration = 0.0
    while duration < 5:
        sampler.sample(n, batch_size=batch_size)
        samples += n
        duration = time.perf_counter() - start

    samples_per_second = samples / duration
    print(f"{samples_per_second / 1e6:.1f} M samples/s")


benchmark_detector_sampler("surface7.stim", 10_000_000)
benchmark_detector_sampler("surface3_coherent.stim", 100_000_000)
benchmark_detector_sampler("surface5_coherent.stim", 1_000_000)

github-actions · 2026-03-25T21:18:24Z

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
2296	2201	96%	0%	🟢

New Files

No new covered files...

Modified Files

File	Coverage	Status
src/tsim/circuit.py	94%	🟢
src/tsim/compile/pipeline.py	100%	🟢
src/tsim/core/graph.py	91%	🟢
src/tsim/core/types.py	100%	🟢
src/tsim/sampler.py	90%	🟢
TOTAL	95%	🟢

updated for commit: acbece1 by action🐍

github-actions · 2026-03-25T21:21:35Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-05-19 17:58 UTC

Roger-luo · 2026-04-17T01:00:36Z

Hi! Could you add a bit more context to this issue? A description of the expected behavior, use case, or any relevant details would help us prioritize and implement it. Thanks!

github-actions · 2026-05-19T14:46:29Z

☂️ Code Coverage

current status: ✅

Overall Coverage

Statements	Covered	Coverage	Threshold	Status
2577	2495	97%	0%	🟢

New Files

No new covered files...

Modified Files

File	Coverage	Status
src/tsim/circuit.py	97%	🟢
src/tsim/compile/pipeline.py	100%	🟢
src/tsim/core/graph.py	91%	🟢
src/tsim/core/types.py	100%	🟢
src/tsim/sampler.py	94%	🟢
TOTAL	96%	🟢

updated for commit: 6c72a81 by action🐍

- Introduced a zero-copy fast path in `_CompiledSamplerBase` to optimize sampling when direct f-indices are contiguous, without flips or reindexing.
- Updated `compile_program` to consolidate direct entries into a single list, enhancing clarity and efficiency in handling direct components.
- Simplified the logic for sorting direct entries and adjusted the output order to align with the original layout, reducing potential reindexing at sample time.
- Enhanced the `transform_error_basis` function to streamline the detection of phase variables, improving overall graph processing efficiency.
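
The zero-copy condition in the first bullet can be sketched in NumPy as follows. This is an assumption-laden illustration: `direct_outputs_sketch` is a hypothetical helper, and the argument layout (one column per f-variable, one row per shot) is assumed rather than taken from `_CompiledSamplerBase`.

```python
import numpy as np


def direct_outputs_sketch(f_samples, direct_f_indices, direct_flips):
    """Return direct output bits, as a view when no copy is needed."""
    idx = np.asarray(direct_f_indices, dtype=np.intp)
    flips = np.asarray(direct_flips, dtype=bool)
    start = int(idx[0]) if idx.size else 0
    contiguous = idx.size > 0 and np.array_equal(
        idx, np.arange(start, start + idx.size)
    )
    if contiguous and not flips.any():
        # Basic slicing returns a view: no data is copied.
        return f_samples[:, start : start + idx.size]
    # General case: fancy indexing copies, then flips are applied via XOR.
    return f_samples[:, idx] ^ flips


f = np.random.default_rng(0).random((4, 6)) < 0.5
view = direct_outputs_sketch(f, [2, 3, 4], [False, False, False])
print(np.shares_memory(view, f))  # True
```

The fallback branch covers non-contiguous indices and flipped outputs, where a copy is unavoidable in this sketch.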

- Added a new test for the detector sampler that verifies it returns empty arrays when no detectors are present, both with and without a reference sample.
- Removed outdated test cases that were previously causing crashes due to empty concatenation.
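
For context on the empty-concatenation crash: `np.concatenate` raises `ValueError` when given an empty sequence, so a sampler with zero outputs needs an explicit guard. The helper below is a hypothetical sketch of such a guard, not the code added in this PR.

```python
import numpy as np


def concat_component_outputs(parts, shots):
    """Concatenate per-component output columns, tolerating zero components."""
    if not parts:
        # No detectors: return an empty (shots, 0) boolean array instead of
        # calling np.concatenate on an empty list, which raises ValueError.
        return np.zeros((shots, 0), dtype=np.bool_)
    return np.concatenate(parts, axis=1)


print(concat_component_outputs([], shots=5).shape)  # (5, 0)
```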

rafaelha · 2026-05-19T15:19:44Z

@codex review again

Copilot

Pull request overview

This PR adds a fast path for detector sampling when connected components are deterministically given by a single error-basis variable (f_i), avoiding the JAX compilation + autoregressive sampling pipeline for those components. This targets faster detector sampling on low-noise surface-code-like circuits (per the linked paper).

Changes:

- Detect “direct” connected components (single output equal to one f variable, optionally flipped) and represent them explicitly in CompiledProgram.
- Add a pure-NumPy sampling fast path for programs consisting only of direct components, plus precomputed output reindexing to avoid per-sample argsort.
- Add a unit test for reference sampling with zero detectors, a benchmark-style test, and documentation/changelog updates.
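
A minimal sketch of how precomputed reindexing can replace a per-sample argsort. The names `direct_f_indices`, `direct_flips`, and `output_reindex` appear in this PR's file summary, but their shapes and semantics here are assumptions, and both helper functions are hypothetical.

```python
import numpy as np


def precompute_output_reindex(output_positions):
    """Compile-time step: permutation that restores the original output order.

    output_positions[k] is the original index of the k-th emitted column.
    """
    return np.argsort(np.asarray(output_positions))


def sample_direct_sketch(f_samples, direct_f_indices, direct_flips, output_reindex):
    """Sample-time step: gather f columns, apply flips, reorder with a fixed index."""
    bits = f_samples[:, np.asarray(direct_f_indices)] ^ np.asarray(direct_flips)
    return bits[:, output_reindex]


f = np.array([[True, False, True], [False, False, True]])
reindex = precompute_output_reindex([1, 0])  # computed once, reused per batch
out = sample_direct_sketch(f, [2, 0], [False, True], reindex)
print(out.tolist())  # [[False, True], [True, True]]
```

The argsort runs once when the program is compiled; each call to the sampler then only performs a gather, an XOR, and a fixed column permutation.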

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`src/tsim/core/graph.py`	Adds direct-component classification and prioritizes output-adjacent parameter vertices during error-basis transform.
`src/tsim/compile/pipeline.py`	Splits compilation into direct vs compiled components; precomputes output permutation (`output_reindex`).
`src/tsim/core/types.py`	Extends `CompiledProgram` to carry `direct_f_indices`, `direct_flips`, and `output_reindex`.
`src/tsim/sampler.py`	Uses direct bits in `sample_program`; adds NumPy-only `_sample_direct` fast path; updates `probability_of`.
`src/tsim/circuit.py`	Documents the new detector-sampler fast path behavior.
`test/unit/test_sampler.py`	Adds regression test for `use_detector_reference_sample=True` with no detectors.
`test/integration/test_sampler.py`	Updates `CompiledProgram` construction to include new required fields.
`test/unit/benchmarks/test_classical_detector_sampling.py`	Adds a performance threshold test (currently problematic for CI).
`CHANGELOG.md`	Documents the new detector-sampler fast path under Unreleased.

chatgpt-codex-connector · 2026-05-19T15:28:47Z

Codex Review: Didn't find any major issues. Another round soon, please!

rafaelha · 2026-05-19T15:48:48Z

Code review

Found 1 issue:

The new benchmark file lives under test/unit/benchmarks/ and will be picked up by the default pytest collection (testpaths = "test/" in pyproject.toml), so CI will run a shots = 10_000_000 sample on every PR. The assertion uses an absolute 5e-8 s/shot threshold that is hardware-dependent and will be flaky across runners. Consider gating with a marker (e.g. @pytest.mark.benchmark excluded by default), reducing shot count, or removing the hard threshold. Note: Copilot already flagged this inline on the same PR.

tsim/test/unit/benchmarks/test_classical_detector_sampling.py

Lines 8 to 36 in 746b41f

    
           def test_classical_detector_sampling_time_per_shot(): 
        
               """At p=1e-6 the detector sampler should produce shots faster than 5e-8 s/shot.""" 
        
               d = 7 
        
               p = 1e-6 
        
               shots = 10_000_000 
        
               stim_circuit = stim.Circuit.generated( 
        
                   "surface_code:rotated_memory_z", 
        
                   distance=d, 
        
                   rounds=d, 
        
                   before_round_data_depolarization=p, 
        
                   before_measure_flip_probability=p, 
        
                   after_clifford_depolarization=p, 
        
                   after_reset_flip_probability=p, 
        
               ) 
        
               tc = tsim.Circuit(str(stim_circuit)) 
        
               sampler = tc.compile_detector_sampler() 
        
               # Warm up JIT compilation 
        
               sampler.sample(shots=shots) 
        
               t0 = time.perf_counter() 
        
               sampler.sample(shots=shots) 
        
               sample_time_s = time.perf_counter() - t0 
        
               time_per_shot = sample_time_s / shots 
        
               assert ( 
        
                   time_per_shot < 5e-8 
        
               ), f"Time per shot {time_per_shot * 1e6:.4f} us exceeds 5e-8 s budget"
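
One possible way to act on the review suggestion, using only the standard library: make the benchmark opt-in and report a best-of-N timing instead of asserting an absolute threshold. The environment variable `RUN_BENCHMARKS` and both helper names are hypothetical; the review itself proposes a pytest marker, which this sketch does not implement.

```python
import os
import time


def benchmarks_enabled(env=None):
    """Benchmarks run only when RUN_BENCHMARKS=1 is set (hypothetical switch)."""
    env = os.environ if env is None else env
    return env.get("RUN_BENCHMARKS", "0") == "1"


def best_time_per_shot(sample_fn, shots, repeats=3):
    """Best-of-`repeats` wall-clock seconds per shot, after one warm-up call."""
    sample_fn(shots)  # warm-up, excluded from the timing
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        sample_fn(shots)
        best = min(best, time.perf_counter() - t0)
    return best / shots


if benchmarks_enabled():
    # Stand-in workload; a real benchmark would call the detector sampler here.
    print(f"{best_time_per_shot(lambda n: sum(range(n)), 100_000):.2e} s/shot")
```

Taking the minimum over several repeats reduces sensitivity to noisy CI runners, although any absolute budget remains hardware-dependent.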

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

PR QuEraComputing#78 moved the matmul_gf2 + prod-over-T calls into per-family modules in terms.py, which dropped the dispatch sites the prior optimization commit relied on. This commit re-plumbs them.

terms.py
- Per-module CSR fields (NodePhases.params_csr, HalfPiPhases.params_csr, PiProducts.psi_csr/phi_csr, PhasePairs.alpha_csr/beta_csr), static so jit caches by instance identity (D, TSIM_GF2MM_BACKEND=cust_spdn / cust).
- Lazy prod-over-T helpers lifted from the pre-refactor evaluate.py: _lazy_node_phases_prod / _lazy_phase_pairs_prod (B, TSIM_TERM_VALS_LAZY=1), and split-coeffs variants (B + SPLIT_COEFFS=1).
- Cust prod-over-T kernels: _cust_node_phases_prod / _cust_phase_pairs_prod (C-prod, TSIM_TERM_VALS_CUST_KERNEL=1).
- Complex-backend prod-over-T helpers with unroll=True scan and optional c128 cust kernel (H/H2, TSIM_SCALAR_BACKEND=complex128).
- Dispatch funnels (_node_phases_prod_dispatch / _phase_pairs_prod_dispatch) resolve flag precedence: complex → cust → split → lazy → original path.
- PiProducts.evaluate: cast sum_exponents to int32 before the `1 - 2*x` step. Without the cast, uint promotion wraps (255 in uint8, 2^64-1 in uint64 under JAX_ENABLE_X64=1) and `signed * _IDENTITY` carries garbage.

compile.py
- Each _compile_* helper calls build_params_csr on its (G, T, P) bitmask and forwards the result. build_params_csr returns None when cust_jax isn't available or G*T < TSIM_GF2MM_MIN_GT, so non-cust builds are unaffected.

evaluate.py
- make_summands replaces ExactScalarArray for static_phases / float_factor so the choice of scalar backend is centralized.
- Complex case folds 2^power2 at to_complex() time (ComplexScalarArray has no power axis to absorb the factor).

exact_scalar.py
- _USE_COMPLEX128 + make_summands exposed for terms.py and evaluate.py.

linalg.py
- matmul_gf2 takes optional csr arg. When the csr is non-None and a cust backend is selected, dispatches to matmul_gf2_csr_ffi (cust) or matmul_gf2_csr_spdn_ffi (cust_spdn). Falls back to float32 matmul otherwise. Bit-exact in both paths.

Tests:
- Default (no flags): 777/777 pass.
- B (TSIM_TERM_VALS_LAZY=1): 777/777 pass.
- B + SPLIT_COEFFS=1: 777/777 pass.
- D fallback (no cust_jax): 777/777 pass.
- H (JAX_ENABLE_X64=1 + TSIM_SCALAR_BACKEND=complex128): 776/777 pass; test_seed asserts an exact-count sample histogram which differs by 2 under c128 vs ESA arithmetic (also fails on plain upstream + x64; not introduced here).

C-prod / H2 cust kernels require cust_jax and a GPU; the dispatch is in place but not exercised in this env.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
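
The unsigned wraparound described for the `1 - 2*x` step can be reproduced in plain NumPy. NumPy is used here instead of JAX purely for illustration, and the variable names are hypothetical.

```python
import numpy as np

# Exponent parities stored as unsigned 8-bit integers (assumed dtype).
sum_exponents = np.array([0, 1], dtype=np.uint8)

# Without a cast the arithmetic stays in uint8, so 1 - 2 wraps around to 255.
wrapped = 1 - 2 * sum_exponents
print(wrapped.tolist())  # [1, 255]

# Casting to a signed type first yields the intended signs +1 / -1.
signed = 1 - 2 * sum_exponents.astype(np.int32)
print(signed.tolist())  # [1, -1]
```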

rafaelha added 3 commits February 26, 2026 21:07

Implement sparse channel sampler via geometric distribution as sugges…

28d8e51

…ted in https://github.com/kh428/accel-cutting-magic-state

Enhance circuit compilation and sampling by introducing direct handli…

206bd00

…ng for components determined by single f-variables.

Numpy fast path

8a0e0e4

rafaelha marked this pull request as draft March 25, 2026 21:45

rafaelha added 9 commits April 1, 2026 19:22

Merge branch 'main' of https://github.com/QuEraComputing/tsim into ra…

688c9f8

…faelha/fast_deterministic_detector_sampling

Merge branch 'main' of https://github.com/QuEraComputing/tsim into ra…

a6c43b4

…faelha/fast_deterministic_detector_sampling

Fix return type in _CompiledSamplerBase to ensure boolean output

1120c19

Merge branch 'main' of https://github.com/QuEraComputing/tsim into ra…

6daaaae

…faelha/fast_deterministic_detector_sampling

Change order (to reduce diff)

dacaaf3

Change to bool

964c123

Move classify_direct into helper file

92ba339

Improve variable naming

ca53b2a

Add handling for circuits with no outputs in sampler

acbece1

Remove performance bottleneck

400766b

rafaelha force-pushed the rafaelha/fast_deterministic_detector_sampling branch from 6197df5 to 400766b Compare April 29, 2026 15:05

Merge branch 'main' of https://github.com/QuEraComputing/tsim into ra…

eadd0bf

…faelha/fast_deterministic_detector_sampling

rafaelha added 3 commits May 19, 2026 11:08

Add benchmark test for classical detector sampling performance

3b3045f

Update changelog

6301668

rafaelha marked this pull request as ready for review May 19, 2026 15:14

rafaelha requested a review from Copilot May 19, 2026 15:19

Copilot started reviewing on behalf of rafaelha May 19, 2026 15:19 View session

Copilot AI reviewed May 19, 2026

Comment thread test/unit/benchmarks/test_classical_detector_sampling.py Outdated

Remove flaky test

6c72a81

rafaelha merged commit b25cb6e into main May 19, 2026
10 checks passed

rafaelha deleted the rafaelha/fast_deterministic_detector_sampling branch May 19, 2026 17:51

rafaelha merged 19 commits into main from rafaelha/fast_deterministic_detector_sampling

3 participants
